What's not done / known gaps by ArksherX · Pull Request #5 · SL5TaskForce/agent-gateway

ArksherX · 2026-05-16T14:09:39Z

Summary

Adds per-identity byte-rate limiting to data tunnelled through the gateway. Each agent identity (extracted from the mTLS client certificate extension) gets its own token bucket rate limiter. If an identity exceeds its configured throughput limit, the copy loop for its tunnels is throttled automatically.

Components of the Gateway (task item 1)

The gateway is composed of the following parts:

main.rs — Entry point. Loads config, sets up TLS, starts the TCP listener, and dispatches connections to MakeProxyService.
proxy.rs — Core request handling. Implements the Service trait for hyper, extracts the destination from the CONNECT request, calls the policy engine, opens the upstream TCP connection, and spawns the bidirectional tunnel.
policy.rs — Authorization logic. Extracts the agent identity from the custom X.509 certificate extension, queries PostgreSQL to check for a valid signed permission row, and returns Allow or Deny.
config.rs — Typed configuration structs deserialized from config.toml.
tls.rs — Sets up the rustls server config for mTLS, requiring and verifying client certificates.
rate_limit.rs (new) — Per-identity token bucket rate limiters backed by governor and stored in a DashMap.

What the feature does

Reads [rate_limit] from config.toml — bytes_per_second and burst_bytes
Maintains an in-memory DashMap<String, Arc<RateLimiter>> keyed by agent identity
When a tunnel is spawned, if bytes_per_second > 0, the bidirectional copy is wrapped with rate limiter checks per identity
Logs when rate limiting is active for a connection
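
As a rough illustration of the flow above, here is a minimal std-only sketch. It is not the code in this PR: a hand-rolled token bucket stands in for governor's RateLimiter, a Mutex<HashMap> stands in for DashMap, and the names LimiterRegistry, TokenBucket and for_identity, as well as the u32 field types, are assumptions made for the example. Time is passed in explicitly so the behaviour is easy to check.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::Duration;

/// Mirrors the two `[rate_limit]` keys; the field types are assumed.
#[derive(Clone, Copy, Debug)]
pub struct RateLimitConfig {
    pub bytes_per_second: u32,
    pub burst_bytes: u32,
}

/// Hand-rolled token bucket. `now` is passed in so the logic is deterministic.
#[derive(Debug)]
pub struct TokenBucket {
    cfg: RateLimitConfig,
    tokens: f64,
    last: Duration,
}

impl TokenBucket {
    pub fn new(cfg: RateLimitConfig, now: Duration) -> Self {
        // The bucket starts full, so a fresh identity may burst immediately.
        Self { cfg, tokens: f64::from(cfg.burst_bytes), last: now }
    }

    /// Returns true and spends `n` tokens if `n` bytes may pass at `now`.
    /// A request larger than `burst_bytes` can never succeed, so callers
    /// are expected to split data into chunks no larger than the burst.
    pub fn try_acquire(&mut self, n: u32, now: Duration) -> bool {
        // Refill for the elapsed time, capped at the burst size.
        let elapsed = now.saturating_sub(self.last).as_secs_f64();
        self.last = self.last.max(now);
        let refill = elapsed * f64::from(self.cfg.bytes_per_second);
        self.tokens = (self.tokens + refill).min(f64::from(self.cfg.burst_bytes));
        let cost = f64::from(n);
        if self.tokens >= cost {
            self.tokens -= cost;
            true
        } else {
            false
        }
    }
}

/// One shared bucket per agent identity, created lazily on first use.
#[derive(Default)]
pub struct LimiterRegistry {
    buckets: Mutex<HashMap<String, Arc<Mutex<TokenBucket>>>>,
}

impl LimiterRegistry {
    /// Returns `None` when `bytes_per_second` is 0, i.e. limiting is disabled.
    pub fn for_identity(
        &self,
        identity: &str,
        cfg: RateLimitConfig,
        now: Duration,
    ) -> Option<Arc<Mutex<TokenBucket>>> {
        if cfg.bytes_per_second == 0 {
            return None;
        }
        let mut map = self.buckets.lock().unwrap();
        let bucket = map
            .entry(identity.to_owned())
            .or_insert_with(|| Arc::new(Mutex::new(TokenBucket::new(cfg, now))));
        Some(Arc::clone(bucket))
    }
}
```

Because the bucket is shared through an Arc, every tunnel opened by the same identity draws from the same budget, which is what makes the limit per identity rather than per connection.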

Design choices and tradeoffs

In-memory over persistent storage
State lives in a DashMap on the heap. This means limiter state resets on gateway restart and is not shared across multiple gateway instances. This is the right tradeoff for a single-node SL5 weight enclave deployment: restarts are controlled events, and the SL5 threat model assumes a single-facility enclave. Adding distributed state (e.g. a Postgres counter per identity per time window) would be the correct next step for multi-node deployments, at the cost of a database round trip per data chunk.

Global config, not per-identity config
All identities share the same bytes_per_second and burst_bytes values from config. Per-identity limits would require either a config map keyed by identity string or a new database column — straightforward to add but out of scope for this implementation given the simplicity priority.
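
For reference, the config-map variant mentioned above could look roughly like the sketch below. This is not part of the PR: RateLimitTable, per_identity and limit_for are hypothetical names, and the u32 field types are assumed.

```rust
use std::collections::HashMap;

/// Mirrors the two `[rate_limit]` keys; the field types are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RateLimitConfig {
    pub bytes_per_second: u32,
    pub burst_bytes: u32,
}

/// Hypothetical extension: global defaults plus optional per-identity overrides.
#[derive(Clone, Debug)]
pub struct RateLimitTable {
    pub global: RateLimitConfig,
    pub per_identity: HashMap<String, RateLimitConfig>,
}

impl RateLimitTable {
    /// Returns the override for `identity` if one exists, else the global limit.
    pub fn limit_for(&self, identity: &str) -> RateLimitConfig {
        self.per_identity
            .get(identity)
            .copied()
            .unwrap_or(self.global)
    }
}
```

Since RateLimitConfig is Copy, the lookup hands back a value rather than a reference, so the result can be moved into a spawned task without lifetime issues.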

What it protects against and what it doesn't
The rate limit addresses sustained bulk exfiltration: an agent continuously streaming large volumes of data will be throttled to bytes_per_second. It does not address short bursts that fit within the burst allowance (burst_bytes), and it does not address an adversary who controls multiple distinct identities, since each identity gets its own budget. It is a bandwidth control, not a session control.
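
To make these limits concrete, the sketch below computes a lower bound on transfer time under an idealised bucket that starts full. The function names and the numbers in the usage note are illustrative assumptions, not measurements from the gateway.

```rust
/// Lower bound, in seconds, on moving `total_bytes` through one bucket that
/// starts full: the first `burst_bytes` pass immediately, the rest drain at
/// `bytes_per_second`.
pub fn min_transfer_secs(total_bytes: u64, bytes_per_second: u64, burst_bytes: u64) -> f64 {
    assert!(bytes_per_second > 0, "a rate of 0 means limiting is disabled");
    let throttled = total_bytes.saturating_sub(burst_bytes);
    throttled as f64 / bytes_per_second as f64
}

/// Same bound when the payload is split evenly across `identities`
/// independent buckets that drain in parallel.
pub fn min_transfer_secs_split(
    total_bytes: u64,
    bytes_per_second: u64,
    burst_bytes: u64,
    identities: u64,
) -> f64 {
    assert!(identities > 0);
    // Ceiling division: the largest share determines the finish time.
    let share = (total_bytes + identities - 1) / identities;
    min_transfer_secs(share, bytes_per_second, burst_bytes)
}
```

With an illustrative 1 MB/s rate and a 4 MB burst, a 10 GB payload needs at least 9996 seconds through one identity, but only 996 seconds when split across ten identities, and a 3 MB payload is not delayed at all.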

Crate choice: governor
governor provides a well-tested rate limiter with no_std support and minimal dependencies. It implements the Generic Cell Rate Algorithm (GCRA), which is functionally equivalent to a token (leaky) bucket. RateLimiter::direct with a Quota::per_second is the simplest correct primitive for bytes-per-second limiting.
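
governor documents its algorithm as the Generic Cell Rate Algorithm (GCRA), which tracks a single "theoretical arrival time" per limiter instead of a token count. The sketch below is a minimal std-only rendering of that idea with one cell per byte. It is not governor's code or API; Gcra, Denied and this check_n are names invented for the example.

```rust
/// Why a request was refused.
#[derive(Debug, PartialEq, Eq)]
pub enum Denied {
    /// The request would be admitted at this time (ns) if nothing else arrives first.
    RetryAt(u64),
    /// The request is larger than the burst and can never be admitted.
    ExceedsBurst,
}

/// Minimal GCRA state. Times are nanoseconds since an arbitrary epoch.
pub struct Gcra {
    emission_interval_ns: u64, // T: time "cost" of one cell
    tolerance_ns: u64,         // burst * T: how far ahead of schedule we may run
    tat_ns: u64,               // theoretical arrival time of the next cell
}

impl Gcra {
    pub fn new(cells_per_second: u32, burst: u32) -> Self {
        assert!(cells_per_second > 0 && burst > 0);
        // Integer division truncates; acceptable for a sketch.
        let t = 1_000_000_000 / u64::from(cells_per_second);
        Self {
            emission_interval_ns: t,
            tolerance_ns: t * u64::from(burst),
            tat_ns: 0,
        }
    }

    /// Admit `n` cells at `now_ns`, or explain why not.
    pub fn check_n(&mut self, n: u32, now_ns: u64) -> Result<(), Denied> {
        let increment = self.emission_interval_ns * u64::from(n);
        if increment > self.tolerance_ns {
            return Err(Denied::ExceedsBurst);
        }
        // Schedule the cells after whichever is later: the backlog or now.
        let new_tat = self.tat_ns.max(now_ns) + increment;
        if new_tat - now_ns <= self.tolerance_ns {
            self.tat_ns = new_tat;
            Ok(())
        } else {
            Err(Denied::RetryAt(new_tat - self.tolerance_ns))
        }
    }
}
```

The ExceedsBurst case matters for a byte-rate limiter: a chunk larger than burst_bytes can never pass, so a throttled copy loop has to cap its chunk size at the burst.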

Implementation process and tools used

Read the upstream codebase to understand where the bidirectional copy happens (spawn_tunnel in proxy.rs)
Used Claude (Anthropic) extensively for Rust-specific guidance: fixing lifetime errors with tokio::spawn, resolving clippy pedantic lints (needless_pass_by_value, clone_on_copy), and deriving Copy on RateLimitConfig to satisfy both the borrow checker and clippy simultaneously
All design decisions were made independently; Claude was used as a Rust reference, not an architect